This report includes a description and the results of the analyses requested by Elisa Jentho at 01/06/2021:
Plot marker genes upon integrated Cre3 and Lox2 Plasmodium chabaudi scRNA-seq samples with Seurat:
Markers_for_UMAP.xlsx: plot the list of markers in the overlapped UMAP of the integrated Cre3 and Lox2 samples.
More_markers.xlsx: plot the list of markers separately across the two UMAPs of the integrated Cre3 and Lox2 samples.
The R version used was 3.6.3 (R Core Team 2020). This report was built with the rmarkdown R package (v.2.5) (Allaire et al. 2020; Xie, Allaire, and Grolemund 2018; Xie, Dervieux, and Riederer 2020). See the full list of packages and versions used at the end of this report - R packages used and respective versions.
The integrated 10x Lox2 and Cre3 samples processed previously were imported to plot the markers upon the UMAPs with the Seurat R package (v.4.0.0) (Satija et al. 2015; Butler et al. 2018; Stuart et al. 2019; Hao et al. 2020).
The figures and tables displayed through the report can be download in pdf and tab-separated format by clicking on the bottom left Download plot and Download table button that appears after each plot and table, respectively.
## Set seed
set.seed(seed = 1024) # to keep reproducibility
## Import packages
library("dplyr", quietly = TRUE)
library("DT", quietly = TRUE) # package to print nice in the html report
library("Seurat", quietly = TRUE)
library("gplots", quietly = TRUE)
library("Matrix", quietly = TRUE)
library("ggplot2", quietly = TRUE)
library("tidyr", quietly = TRUE)
library("readxl", quietly = TRUE)
## Import Seurat object with the re3 & Lox2 integrated samples
# import obj integrated
seu <- readRDS(file = "../results/int/R_objects/seu.rds")
## Import marker lists
# marker lists to import
markers2import <- list.files(path = "../data/markers_to_plot", full.names = TRUE)
names(markers2import) <- c("int", "sep")
# "int": integrated/overlapped UMAP marker list | "sep": separated UMAP marker list
# import lists
markers <- list(
"int" = read_excel(path = markers2import[1], col_names = TRUE),
"sep" = read_excel(path = markers2import[2], col_names = FALSE)
)
# parse lists
markers$int = markers$int %>% select_if(is.character)
colnames(markers$sep) <- "Genes"
The two lists of markers can be found below.
To plot markers it was used the integrated Seurat object with the Cre3 and Lox2 samples. The assay used was RNA and the slot data with the counts normalized with the LogNormalize method (read more).
The individual plots of the cluster markers highlighted through the integrated UMAPs can be found at: results/markers/int/ (with each cluster folder holding their respective markers).
## Plot list of markers by cluster
# folder to save results
plot_markers <- "../results/markers"
if ( ! dir.exists(plot_markers) ) dir.create(plot_markers)
# list to save plots
marker_plots[["sep"]] <- marker_plots[["int"]] <- marker_plots <- list()
# check assay
stopifnot(DefaultAssay(seu) == "RNA")
for ( clt in colnames(markers$int) ) { # loop over cols/cluster markers
clt_name <- gsub(pattern = " ", replacement = "-", x = clt)
# create new folder for the cluster
plot_clt_folder <- paste(plot_markers, paste0("int/", clt_name), sep = "/")
if ( ! dir.exists(plot_clt_folder) ) dir.create(plot_clt_folder, recursive = TRUE)
marker_genes <- markers$int[,clt,drop=TRUE] # retrieve the marker genes for each col/cluster
# save to list
marker_plots$int[[clt_name]] <- list()
for ( mkr in marker_genes ) { # loop over marker genes within a cluster
if( ! is.na(mkr) ) { # plot them only if are not 'NA'
stopifnot( mkr %in% row.names(seu) ) # stop if gene does not exist in the Seurat obj
marker_plots$int[[clt_name]][[mkr]] <- FeaturePlot(object = seu, features = mkr,
slot = "data", reduction = "umap",
pt.size = 0.001)
ggsave(filename = paste(plot_clt_folder, paste0(clt_name, "_", mkr, "_gene_UMAP.pdf"), sep = "/"),
plot = marker_plots$int[[clt_name]][[mkr]], height = 5, width = 5)
}
}
}
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-0`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-1`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-2`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-3`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-4`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-5`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-6`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-7`)
cowplot::plot_grid(plotlist = marker_plots$int$`Cluster-8`)
To plot markers it was used the integrated Seurat object with the Cre3 and Lox2 samples. The assay used was RNA and the slot data with the counts normalized with the LogNormalize method (read more).
The individual plots of the markers split by sample highlighted through the integrated UMAPs can be found at: results/markers/sep/.
WARNING: The genes PCHAS-051500 and PCHAS-141240 provided as markers to plot were not found across the scRNA-seq data sets, and, thus they were not plotted.
## Plot list of markers by sample
# folder to save results
plot_sep_markers <- "../results/markers/sep"
if ( ! dir.exists(plot_sep_markers) ) dir.create(plot_sep_markers)
# Seurat: split objects by sample
sobjList <- SplitObject(seu, split.by = "orig.ident")
# check assay
stopifnot(DefaultAssay(seu) == "RNA")
stopifnot( c(DefaultAssay(sobjList$Cre3), DefaultAssay(sobjList$Lox2)) == "RNA" )
for ( mkr in markers$sep$Genes ) { # loop over marker genes within a cluster
#stopifnot( mkr %in% row.names(seu) ) # stop if gene does not exist in the Seurat obj
if ( mkr %in% row.names(seu) ) {
marker_plots$sep[[mkr]] <- list()
for ( sobj in rev(names(sobjList)) ) {
marker_plots$sep[[mkr]][[sobj]] <- FeaturePlot(object = sobjList[[sobj]], features = mkr,
slot = "data", reduction = "umap",
pt.size = 0.001)
ggsave(filename = paste(plot_sep_markers, paste0(sobj, "_", mkr, "_gene_UMAP_split_by_sample.pdf"), sep = "/"),
plot = marker_plots$sep[[mkr]][[sobj]], height = 5, width = 5)
}
}
}
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-021240`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-121330`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-100390`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-140270`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-143360`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-110810`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-040720`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-031680`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-122070`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-142660`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-146340`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-072660`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-081890`)
cowplot::plot_grid(plotlist = marker_plots$sep$`PCHAS-113160`)
Folders inside the project folder:
results: folder that contains all the results obtained in this analysis;
report: folder that contains the report and code used herein;
data: folder that contains the data used herein;
scripts: folder that contains the scripts used herein;
info: folder that contains some useful information (i.e., papers, etc) used herein;
## R packages and versions used in these analyses
sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] readxl_1.3.1 tidyr_1.1.2 ggplot2_3.3.2 Matrix_1.3-2
## [5] gplots_3.1.1 SeuratObject_4.0.0 Seurat_4.0.0 DT_0.16
## [9] dplyr_0.8.5
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-152 bitops_1.0-6 matrixStats_0.57.0
## [4] RcppAnnoy_0.0.18 RColorBrewer_1.1-2 httr_1.4.2
## [7] sctransform_0.3.2 tools_3.6.3 R6_2.5.0
## [10] irlba_2.3.3 rpart_4.1-15 KernSmooth_2.23-20
## [13] uwot_0.1.10 mgcv_1.8-35 lazyeval_0.2.2
## [16] colorspace_1.4-1 withr_2.4.1 tidyselect_1.1.0
## [19] gridExtra_2.3 compiler_3.6.3 plotly_4.9.3
## [22] labeling_0.4.2 caTools_1.18.0 scales_1.1.1
## [25] lmtest_0.9-38 spatstat.data_1.7-0 ggridges_0.5.3
## [28] pbapply_1.4-3 goftest_1.2-2 spatstat_1.64-1
## [31] stringr_1.4.0 digest_0.6.27 spatstat.utils_2.0-0
## [34] rmarkdown_2.5 pkgconfig_2.0.3 htmltools_0.5.1.1
## [37] parallelly_1.22.0 fastmap_1.1.0 htmlwidgets_1.5.3
## [40] rlang_0.4.10 shiny_1.6.0 farver_2.0.3
## [43] zoo_1.8-8 jsonlite_1.7.2 crosstalk_1.1.0.1
## [46] gtools_3.8.2 ica_1.0-2 magrittr_2.0.1
## [49] patchwork_1.1.1 Rcpp_1.0.6 munsell_0.5.0
## [52] abind_1.4-5 reticulate_1.18 lifecycle_0.2.0
## [55] stringi_1.5.3 yaml_2.2.1 MASS_7.3-54
## [58] Rtsne_0.15 plyr_1.8.6 grid_3.6.3
## [61] parallel_3.6.3 listenv_0.8.0 promises_1.1.1
## [64] ggrepel_0.9.1 crayon_1.3.4 deldir_0.2-9
## [67] miniUI_0.1.1.1 lattice_0.20-44 cowplot_1.1.1
## [70] splines_3.6.3 tensor_1.5 knitr_1.30
## [73] klippy_0.0.0.9500 pillar_1.4.7 igraph_1.2.6
## [76] future.apply_1.7.0 reshape2_1.4.4 codetools_0.2-18
## [79] leiden_0.3.7 glue_1.4.2 evaluate_0.14
## [82] data.table_1.13.2 vctrs_0.3.6 png_0.1-7
## [85] httpuv_1.5.5 cellranger_1.1.0 polyclip_1.10-0
## [88] gtable_0.3.0 RANN_2.6.1 purrr_0.3.4
## [91] scattermore_0.7 future_1.21.0 assertthat_0.2.1
## [94] xfun_0.19 mime_0.9 xtable_1.8-4
## [97] later_1.1.0.1 survival_3.2-11 viridisLite_0.3.0
## [100] tibble_3.0.6 cluster_2.1.2 globals_0.14.0
## [103] fitdistrplus_1.1-3 ellipsis_0.3.1 ROCR_1.0-11
Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2020. Rmarkdown: Dynamic Documents for R. https://github.com/rstudio/rmarkdown.
Butler, Andrew, Paul Hoffman, Peter Smibert, Efthymia Papalexi, and Rahul Satija. 2018. “Integrating Single-Cell Transcriptomic Data Across Different Conditions, Technologies, and Species.” Nature Biotechnology 36: 411–20. https://doi.org/10.1038/nbt.4096.
Hao, Yuhan, Stephanie Hao, Erica Andersen-Nissen, William M. Mauck III, Shiwei Zheng, Andrew Butler, Maddie J. Lee, et al. 2020. “Integrated Analysis of Multimodal Single-Cell Data.” bioRxiv. https://doi.org/10.1101/2020.10.12.335331.
R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Satija, Rahul, Jeffrey A Farrell, David Gennert, Alexander F Schier, and Aviv Regev. 2015. “Spatial Reconstruction of Single-Cell Gene Expression Data.” Nature Biotechnology 33: 495–502. https://doi.org/10.1038/nbt.3192.
Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M Mauck III, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177: 1888–1902. https://doi.org/10.1016/j.cell.2019.05.031.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.